1. Statistical Tests & Methods
Below is a brief introduction to each statistical test and method utilized in this analysis. Sources of information that helped in compiling these summaries are listed in the References section.
ADF (Augmented Dickey-Fuller) Test
The ADF test checks whether a time series is stationary by testing for a unit root. The null hypothesis (H_0) is that the series has a unit root (non-stationary), while the alternative hypothesis (H_1) is that the series is stationary. A p-value below the chosen significance level (typically 0.05) leads to rejecting H_0, which is evidence that the series is stationary. Failing to reject H_0 does not prove that a unit root exists; it only means the data are consistent with one.
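To make the mechanics concrete, here is a minimal sketch of the simpler, non-augmented Dickey-Fuller regression, dy_t = alpha + gamma * y_{t-1} + e_t, using only the Python standard library. It is not the full ADF test: it omits the lagged difference terms and does not compute a p-value. The function name `dickey_fuller_tstat`, the simulated series, and the use of the approximate large-sample 5% critical value of -2.86 (constant, no trend) are assumptions made for illustration.

```python
import random
from statistics import mean

def dickey_fuller_tstat(y):
    """t-statistic on gamma in dy_t = alpha + gamma * y_{t-1} + e_t.

    Simplified Dickey-Fuller regression (no lagged differences), fitted by
    ordinary least squares with an intercept.
    """
    x = y[:-1]                                    # y_{t-1}
    dy = [b - a for a, b in zip(y[:-1], y[1:])]   # first differences
    n = len(dy)
    x_bar, dy_bar = mean(x), mean(dy)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (di - dy_bar) for xi, di in zip(x, dy))
    gamma = sxy / sxx
    alpha = dy_bar - gamma * x_bar
    rss = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, dy))
    se_gamma = (rss / (n - 2) / sxx) ** 0.5       # standard error of gamma
    return gamma / se_gamma

# Illustrative data: Gaussian white noise, which is stationary by construction.
random.seed(0)
noise = [random.gauss(0, 1) for _ in range(200)]
t_noise = dickey_fuller_tstat(noise)

# Reject the unit-root null when the statistic falls below the critical value.
CRITICAL_5PCT = -2.86  # approximate large-sample value, constant-only case
print("reject unit root:", t_noise < CRITICAL_5PCT)
```

The statistic is compared against Dickey-Fuller critical values rather than the usual t-distribution, because under H_0 its distribution is non-standard. In practice a library implementation that selects the lag length and reports a p-value would normally be used instead of this sketch.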
Phillips-Perron (PP) Test
Like the ADF test, the PP test checks for a unit root, but it corrects the test statistic nonparametrically for serial correlation and heteroskedasticity in the errors instead of adding lagged difference terms to the regression. Here, H_0 states that the series has a unit root (non-stationary), and rejecting H_0 is evidence of stationarity.
KPSS (Kwiatkowski-Phillips-Schmidt-Shin) Test
The KPSS test reverses the hypotheses: under H_0 the series is stationary. If the p-value is small, H_0 is rejected, suggesting the series is non-stationary. Because its null is stationarity rather than a unit root, the KPSS test complements the ADF and PP tests, and agreement among them gives stronger evidence than any one test alone.
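As a rough illustration of how the KPSS statistic is built, the sketch below computes the level-stationarity version: the sum of squared partial sums of the demeaned series, scaled by n^2 and by a long-run variance estimate with Bartlett weights. The function name `kpss_level_stat`, the lag truncation of 4, and the example series are assumptions; 0.463 is the commonly tabulated 5% critical value for the level-stationarity case.

```python
def kpss_level_stat(y, lags):
    """KPSS statistic for level stationarity (constant only, no trend)."""
    n = len(y)
    y_bar = sum(y) / n
    resid = [v - y_bar for v in y]        # residuals from a constant-only fit

    # Partial sums of the residuals.
    running, partial_sums = 0.0, []
    for r in resid:
        running += r
        partial_sums.append(running)

    # Long-run variance estimate with Bartlett (Newey-West) weights.
    lrv = sum(r * r for r in resid) / n
    for k in range(1, lags + 1):
        gamma_k = sum(resid[t] * resid[t - k] for t in range(k, n)) / n
        lrv += 2 * (1 - k / (lags + 1)) * gamma_k

    return sum(s * s for s in partial_sums) / (n * n * lrv)

# Illustrative data: a pure linear trend, which is clearly not level-stationary.
trend = [float(t) for t in range(100)]
stat = kpss_level_stat(trend, lags=4)

CRITICAL_5PCT = 0.463  # level-stationarity case
print("reject stationarity:", stat > CRITICAL_5PCT)
```

Note the direction of the decision: a large statistic rejects stationarity, the opposite of the unit-root tests, where a strongly negative statistic rejects non-stationarity.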
ERS (Elliott-Rothenberg-Stock) Test
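The ERS test also tests for a unit root. It first removes the constant and trend terms from the series by generalized least squares (GLS) detrending and then applies the unit root test to the detrended data. H_0 states the series has a unit root (non-stationary), and rejecting H_0 suggests stationarity. This test generally has higher power than the ADF test, particularly when the series is close to having a unit root.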
ACF (Autocorrelation Function)
The ACF measures the correlation of a series with its own lagged values across different time lags. Significant spikes in the ACF plot at specific lags indicate autocorrelation and can reveal patterns and cycles within the data. In ARIMA modeling, the ACF helps identify the order of the moving average terms.
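The calculation behind an ACF plot can be sketched in a few lines. The function name `acf`, the toy series with a period of 4, and the approximate 95% significance bound of 1.96 / sqrt(n) are assumptions chosen for illustration.

```python
def acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(series)
    m = sum(series) / n
    dev = [v - m for v in series]
    c0 = sum(d * d for d in dev)          # lag-0 sum of squares
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]

# Illustrative data: a pattern that repeats every 4 observations.
seasonal = [10, 20, 30, 20] * 6
r = acf(seasonal, max_lag=4)

# Approximate 95% bound used to judge whether a spike is significant.
bound = 1.96 / len(seasonal) ** 0.5
significant_lags = [k for k in range(1, 5) if abs(r[k]) > bound]
print(significant_lags)
```

For this series the autocorrelation is strongly positive at lag 4, matching the repeating period, which is the kind of spike that points to a seasonal cycle.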
PACF (Partial Autocorrelation Function)
The PACF measures the correlation between a series and its lagged values after removing the correlation explained by the intermediate lags. Significant spikes in the PACF help determine the order of the autoregressive terms, particularly in ARIMA modeling.
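One standard way to obtain partial autocorrelations from ordinary autocorrelations is the Durbin-Levinson recursion, sketched below. The function name `pacf_from_acf` and the example autocorrelations (rho_k = 0.5^k, the theoretical ACF of an AR(1) process with coefficient 0.5) are assumptions for illustration.

```python
def pacf_from_acf(rho):
    """Partial autocorrelations for lags 0..len(rho)-1 via Durbin-Levinson.

    rho is a list of autocorrelations with rho[0] == 1.
    """
    pacf = [1.0]
    phi_prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, len(rho)):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        pacf.append(phi_kk)
    return pacf

# Theoretical ACF of an AR(1) process with coefficient 0.5.
rho = [1.0, 0.5, 0.25, 0.125]
partial = pacf_from_acf(rho)
print(partial)
```

The ACF of this process decays gradually, while the PACF drops to zero after lag 1. That cut-off is what suggests a single autoregressive term.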
Adjusted R^2
Adjusted R^2 measures the proportion of variance explained by the model, penalized for the number of predictors so that adding uninformative predictors does not inflate it. When comparing models fitted to the same data, a higher Adjusted R^2 indicates a better fit relative to model complexity.
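The adjustment uses the standard formula Adjusted R^2 = 1 - (1 - R^2)(n - 1) / (n - p - 1), where n is the number of observations and p the number of predictors. The short sketch below applies it to two hypothetical models; the function name and the R^2 values are made up for illustration.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 for a model with an intercept."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Two hypothetical models fitted to the same 50 observations.
small_model = adjusted_r2(0.800, n_obs=50, n_predictors=2)
large_model = adjusted_r2(0.805, n_obs=50, n_predictors=6)

print(round(small_model, 4), round(large_model, 4))
```

The larger model has the higher raw R^2 but the lower Adjusted R^2, so the four extra predictors do not explain enough additional variance to justify their inclusion.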
Anderson-Darling (AD) Test
The AD test is a normality test that gives more weight to the tails of the distribution. Under H_0, the data is normally distributed; rejection indicates significant deviations from normality, especially in the distribution tails.
Jarque-Bera (JB) Test
The JB test evaluates whether the skewness and kurtosis of a dataset match those of a normal distribution. The null hypothesis H_0 is that the data is normally distributed; rejection suggests non-normality, often due to a skewed or heavy-tailed distribution.
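The JB statistic combines the two moments as JB = n/6 * (S^2 + (K - 3)^2 / 4), where S is the sample skewness and K the sample kurtosis. Under H_0 it is asymptotically chi-square distributed with 2 degrees of freedom, whose upper-tail probability is exp(-JB/2). The sketch below follows that definition; the function name and the sample values are assumptions, and the asymptotic p-value is unreliable for samples this small.

```python
import math

def jarque_bera(x):
    """Return (JB statistic, asymptotic p-value) for a sample."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n   # second central moment
    m3 = sum((v - m) ** 3 for v in x) / n   # third central moment
    m4 = sum((v - m) ** 4 for v in x) / n   # fourth central moment
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    p_value = math.exp(-jb / 2)             # chi-square(2) upper tail
    return jb, p_value

# Illustrative right-skewed sample (made-up values).
sample = [1, 1, 1, 2, 2, 3, 4, 6, 9, 15]
jb_stat, jb_p = jarque_bera(sample)
print(jb_stat, jb_p)
```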
Lilliefors Test
The Lilliefors test is a variation of the Kolmogorov-Smirnov test for normality, used when the population mean and variance are unknown and must be estimated from the sample. H_0 assumes normality, and a significant p-value indicates a non-normal distribution.
Skewness
Skewness measures the asymmetry of data around its mean. A skewness close to zero suggests symmetry, while positive or negative values indicate right or left skew, respectively. Excessive skew may suggest transformations are necessary for normality.
Kurtosis
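Kurtosis quantifies the “tailedness” of data. A normal distribution has a kurtosis of 3 (mesokurtic), so values near 3 are consistent with normality. Higher kurtosis (leptokurtic) suggests heavy tails, while lower kurtosis (platykurtic) indicates light tails. Some software reports excess kurtosis (kurtosis minus 3), for which the normal reference value is 0.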
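Both measures can be computed directly from central moments, as in the minimal sketch below. It uses the simple moment-based definitions without small-sample bias corrections, so results may differ slightly from statistical packages that apply them. The function name and the two small samples are assumptions for illustration.

```python
def skewness_and_kurtosis(x):
    """Moment-based skewness and kurtosis (normal reference: 0 and 3)."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

# A symmetric sample and a sample with one large value on the right.
sym_skew, sym_kurt = skewness_and_kurtosis([-2, -1, 0, 1, 2])
right_skew, right_kurt = skewness_and_kurtosis([1, 1, 1, 2, 10])

print("symmetric:", sym_skew, sym_kurt)
print("right-skewed:", right_skew, right_kurt)
print("excess kurtosis (symmetric):", sym_kurt - 3)
```

The symmetric sample has zero skewness and a kurtosis below 3 (light tails), while the single large value in the second sample produces positive skewness.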
2. References
“Augmented Dickey Fuller Test (ADF Test) – Must Read Guide.” 2019. Machine Learning Plus (blog). November 2, 2019. https://www.machinelearningplus.com/time-series/augmented-dickey-fuller-test/.
Bevans, Rebecca. 2020. “Akaike Information Criterion | When & How to Use It (Example).” Scribbr. March 26, 2020. https://www.scribbr.com/statistics/akaike-information-criterion.
Datalab, Analyttica. 2019. “What Is Bayesian Information Criterion (BIC)?” Medium. January 16, 2019. https://medium.com/@analyttica/what-is-bayesian-information-criterion-bic-b3396a894be6.
Frost, Jim. 2021. “Autocorrelation and Partial Autocorrelation in Time Series Data.” Statistics by Jim. May 17, 2021. https://statisticsbyjim.com/time-series/autocorrelation-partial-autocorrelation/.
Glen, Stephanie. 2016. “Jarque-Bera Test.” Statistics How To. May 8, 2016. https://www.statisticshowto.com/jarque-bera-test/.
Gonzalez-Barrera, Ana. 2020. “After Surging in 2019, Migrant Apprehensions at U.S.-Mexico Border Fell Sharply in Fiscal 2020.” Pew Research Center. November 4, 2020. https://www.pewresearch.org/short-reads/2020/11/04/after-surging-in-2019-migrant-apprehensions-at-u-s-mexico-border-fell-sharply-in-fiscal-2020-2/.
Hinton, Thomas. 2024. “Infographic: Travel and Tourism Drive close to 10% of the US Economy.” Statista Daily Data. Statista. June 20, 2024. https://www.statista.com/chart/32465/travel-and-tourism-contribution-to-gdp-in-the-us/.
“I-94 Arrivals Program.” 2024. International Trade Administration | Trade.gov. 2024. https://www.trade.gov/i-94-arrivals-program.
IBM. 2023. “Adjusted R Squared.” IBM. January 3, 2023. https://www.ibm.com/docs/en/cognos-analytics/11.1.0?topic=terms-adjusted-r-squared.
“International Travel Receipts and Payments Program.” 2024. International Trade Administration | Trade.gov. 2024. https://www.trade.gov/international-travel-receipts-and-payments-program.
Levin, Adam G. 2023. “U.S. Tourism: Economic Impacts and Pandemic Recovery.” Congressional Research Service. https://crsreports.congress.gov/product/pdf/R/R47857.
Malato, Gianluca. 2023. “An Introduction to the Shapiro-Wilk Test for Normality | Built In.” Builtin. 2023. https://builtin.com/data-science/shapiro-wilk-test.
Olivo, Natalie. 2024. “Medium.” Medium. 2024. https://codeburst.io/cross-validation-calculating-r.
“Pp.test Function - RDocumentation.” n.d. RDocumentation. https://www.rdocumentation.org/packages/aTSA/versions/3.1.2.1/topics/pp.test.
“R: Elliott, Rothenberg and Stock Unit Root Test.” 2024. R-Project.org. 2024. https://search.r-project.org/CRAN/refmans/urca/html/ur.ers.html.
“R: Kwiatkowski-Phillips-Schmidt-Shin Test.” n.d. R-Project.org. https://search.r-project.org/CRAN/refmans/aTSA/html/kpss.test.html.
Scott. 2020. “Parameters Primer: ACF (Autocorrelation Function) - Michigan Metrology.” Michigan Metrology. August 16, 2020. https://michmet.com/parameters-primer-acf-autocorrelation-function.
“Seasonal Adjustments and Deseasonalising Data.” 2024. Mathspace.co. 2024. https://mathspace.co/textbooks/syllabuses/Syllabus-845/topics/Topic-18558/subtopics/Subtopic-251579/?activeTab=theory.
“Skewness | R Tutorial.” 2020. R-Tutor. 2020. https://www.r-tutor.com/elementary-statistics/numerical-measures/skewness.
Stephanie. 2016. “Lilliefors Test for Normality & Exponential Distributions.” Statistics How To. March 8, 2016. https://www.statisticshowto.com/lilliefors-test/.
“The Anderson-Darling Statistic.” n.d. Minitab. https://support.minitab.com/en-us/minitab/help-and-how-to/statistics/basic-statistics/supporting-topics/normality/the-anderson-darling-statistic/.
Zach. 2020. “How to Calculate Skewness & Kurtosis in R.” Statology. October 23, 2020. https://www.statology.org/skewness-kurtosis-in-r/.
Zaheer, Aima. 2023. “25 States with Highest Tourism Revenue in the US.” Yahoo Finance. November 29, 2023. https://finance.yahoo.com/news/25-states-highest-tourism-revenue-151108366.html.